Publishing Efficient On-device Models Increases Adversarial Vulnerability
Recent increases in the computational demands of deep neural networks (DNNs)
have sparked interest in efficient deep learning mechanisms, e.g., quantization
or pruning. These mechanisms enable the construction of a small, efficient
version of commercial-scale models with comparable accuracy, accelerating their
deployment to resource-constrained devices.
In this paper, we study the security considerations of publishing on-device
variants of large-scale models. We first show that an adversary can exploit
on-device models to make attacking the large models easier. In evaluations
across 19 DNNs, by exploiting the published on-device models as a transfer
prior, the adversarial vulnerability of the original commercial-scale models
increases by up to 100x. We then show that this vulnerability grows as the
similarity between a full-scale model and its efficient counterpart increases.
Based on these insights, we propose a defense that fine-tunes on-device models
with the objective of reducing this similarity. We evaluated our defense on
all 19 DNNs and found that it reduces transferability by up to 90% and the
number of queries required by a factor of 10-100x. Our results
suggest that further research is needed on the security (or even privacy)
threats caused by publishing those efficient siblings.
Comment: Accepted to IEEE SaTML 2023
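To make the transfer-prior idea concrete, below is a minimal sketch, not the
paper's exact attack: adversarial examples are crafted white-box against the
published on-device model with standard PGD and then replayed against the
full-scale target. The model handles, epsilon budget, and step sizes are
illustrative assumptions.

    import torch

    def pgd_on_surrogate(surrogate, x, y, eps=8/255, alpha=2/255, steps=10):
        # Standard L-infinity PGD, run white-box against the on-device model.
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = torch.nn.functional.cross_entropy(surrogate(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            with torch.no_grad():
                x_adv = x_adv + alpha * grad.sign()        # ascend the loss
                x_adv = x + (x_adv - x).clamp(-eps, eps)   # project into eps-ball
                x_adv = x_adv.clamp(0.0, 1.0)              # stay a valid image
        return x_adv.detach()

    @torch.no_grad()
    def transfer_success_rate(target, x_adv, y):
        # Fraction of surrogate-crafted examples that also fool the large model.
        return (target(x_adv).argmax(dim=1) != y).float().mean().item()

The abstract's core observation then reads directly off this setup: the more
similar the two models, the larger the measured transfer success rate.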
Handcrafted Backdoors in Deep Neural Networks
Deep neural networks (DNNs), while accurate, are expensive to train. Many
practitioners, therefore, outsource the training process to third parties or
use pre-trained DNNs. This practice makes DNNs vulnerable to backdooring
attacks: the third party who trains the model may act maliciously to inject
hidden behaviors into the otherwise accurate model. Until now, the mechanism to
inject backdoors has been limited to poisoning.
We argue that such a supply-chain attacker has more attack techniques
available. To study this hypothesis, we introduce a handcrafted attack that
directly manipulates the parameters of a pre-trained model to inject backdoors.
Our handcrafted attacker has more degrees of freedom in manipulating model
parameters than poisoning. This makes it difficult for a defender to identify
or remove the manipulations with straightforward methods, such as statistical
analysis, adding random noises to model parameters, or clipping their values
within a certain range. Further, our attacker can combine the handcrafting
process with additional techniques, e.g., jointly optimizing a trigger
pattern, to inject backdoors into complex networks effectively, which we call
the meet-in-the-middle attack.
In evaluations, our handcrafted backdoors remain effective across four
datasets and four network architectures with a success rate above 96%. Our
backdoored models are resilient to parameter-level backdoor removal
techniques and can evade existing defenses by slightly changing the backdoor
attack configurations. Moreover, we demonstrate the feasibility of suppressing
unwanted behaviors otherwise caused by poisoning. Our results suggest that
further research is needed for understanding the complete space of supply-chain
backdoor attacks.
Comment: 16 pages, 13 figures, 11 tables
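As a rough, hypothetical illustration of what directly manipulating parameters
can mean (a toy construction, not the paper's actual procedure), the sketch
below repurposes one hidden unit of a small MLP so that it fires on a chosen
trigger pattern and routes its activation into the attacker's target class.
The architecture, the scaling constants, and the assumption that the trigger
correlates weakly with clean inputs are all assumptions of this sketch.

    import torch

    def plant_handcrafted_backdoor(mlp, trigger, target_class,
                                   neuron_idx=0, in_scale=10.0, out_scale=20.0):
        # mlp is assumed to be nn.Sequential(Linear(d, h), ReLU(), Linear(h, k)).
        # trigger is a flat (d,) pattern assumed nearly orthogonal to clean
        # inputs, so the repurposed unit stays silent on clean data and fires
        # strongly only when the trigger is present.
        with torch.no_grad():
            fc_in, fc_out = mlp[0], mlp[2]
            fc_in.weight[neuron_idx] = in_scale * trigger / trigger.norm()
            fc_in.bias[neuron_idx] = 0.0
            fc_out.weight[:, neuron_idx] = 0.0           # detach from other classes
            fc_out.weight[target_class, neuron_idx] = out_scale  # boost target logit
        return mlp

Because no gradient step is involved, edits like this leave the loss landscape
of clean training untouched, which is one intuition for why statistical
defenses over parameters have a hard time spotting them.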
Differentially Private Image Classification from Features
Leveraging transfer learning has recently been shown to be an effective
strategy for training large models with Differential Privacy (DP). Moreover,
somewhat surprisingly, recent works have found that privately training just the
last layer of a pre-trained model provides the best utility with DP. While past
studies largely rely on algorithms like DP-SGD for training large models, in
the specific case of privately learning from features, we observe that
computational burden is low enough to allow for more sophisticated optimization
schemes, including second-order methods. To that end, we systematically explore
the effect of design parameters such as loss function and optimization
algorithm. We find that, while commonly used logistic regression performs
better than linear regression in the non-private setting, the situation is
reversed in the private setting. We find that linear regression is much more
effective than logistic regression from both a privacy and a computational
standpoint, especially at stricter (smaller) epsilon values. On the optimization
side, we also explore using Newton's method, and find that second-order
information is quite helpful even with privacy, although the benefit
significantly diminishes with stricter privacy guarantees. While both methods
use second-order information, least squares is effective at lower epsilons
while Newton's method is effective at larger epsilon values. To combine the
benefits of both, we propose a novel algorithm called DP-FC, which leverages
feature covariance instead of the Hessian of the logistic regression loss and
performs well across all values we tried. With this, we obtain new
SOTA results on ImageNet-1k, CIFAR-100 and CIFAR-10 across all values of
typically considered. Most remarkably, on ImageNet-1K, we obtain
top-1 accuracy of 88\% under (8, )-DP and 84.3\% under (0.1, )-DP
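As a generic sketch of learning privately from features via their covariance,
the code below privatizes the sufficient statistics of linear regression with
Gaussian noise and solves the resulting normal equations. This is not the
paper's exact DP-FC algorithm: the calibration assumes feature rows clipped to
unit L2 norm, one-hot labels, and a noise scale sigma supplied by a DP
accountant for the target (epsilon, delta).

    import numpy as np

    def dp_linear_regression(X, Y, sigma, ridge=1e-2, seed=0):
        # X: (n, d) features, each row clipped to L2 norm <= 1;
        # Y: (n, k) one-hot labels, so each example perturbs X.T @ X and
        # X.T @ Y by at most 1 in Frobenius norm.
        rng = np.random.default_rng(seed)
        n, d = X.shape
        k = Y.shape[1]
        cov = X.T @ X            # feature covariance (sufficient statistic)
        xty = X.T @ Y            # feature-label correlation
        noise = rng.normal(0.0, sigma, size=(d, d))
        cov_noisy = cov + (noise + noise.T) / np.sqrt(2)   # symmetric noise
        xty_noisy = xty + rng.normal(0.0, sigma, size=(d, k))
        # Ridge regularization keeps the noisy covariance well conditioned.
        W = np.linalg.solve(cov_noisy + ridge * n * np.eye(d), xty_noisy)
        return W                 # classify a feature x as argmax of x @ W

One appeal of this family of methods, consistent with the abstract, is that the
noise is added once to fixed-size statistics rather than per SGD step, so the
cost of privacy does not grow with the number of training iterations.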
RETVec: Resilient and Efficient Text Vectorizer
This paper describes RETVec, an efficient, resilient, and multilingual text
vectorizer designed for neural-based text processing. RETVec combines a novel
character encoding with an optional small embedding model to embed words into a
256-dimensional vector space. The RETVec embedding model is pre-trained using
pair-wise metric learning to be robust against typos and character-level
adversarial attacks. In this paper, we evaluate and compare RETVec to
state-of-the-art vectorizers and word embeddings on popular model architectures
and datasets. These comparisons demonstrate that RETVec leads to competitive,
multilingual models that are significantly more resilient to typos and
adversarial text attacks. RETVec is available under the Apache 2 license at
https://github.com/google-research/retvec.
Comment: Accepted at NeurIPS 2023
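For intuition, here is a minimal sketch of pair-wise metric learning for typo
robustness, using a symmetric InfoNCE-style contrastive loss with in-batch
negatives as a stand-in for RETVec's actual objective. The embed function, the
paired-batch construction, and the temperature are assumptions; this is not
RETVec's training code or API.

    import torch
    import torch.nn.functional as F

    def pairwise_metric_loss(embed, words, typo_words, temperature=0.1):
        # embed maps a batch of encoded words to fixed-size vectors;
        # words[i] and typo_words[i] are the same word, clean and typo-perturbed.
        z_a = F.normalize(embed(words), dim=1)
        z_b = F.normalize(embed(typo_words), dim=1)
        logits = z_a @ z_b.T / temperature        # cosine similarity matrix
        targets = torch.arange(z_a.size(0), device=logits.device)
        # Matching pairs sit on the diagonal; everything else is a negative.
        return 0.5 * (F.cross_entropy(logits, targets)
                      + F.cross_entropy(logits.T, targets))

Training against such pairs pushes a word and its misspellings to nearby
points in the 256-dimensional space, which is what makes downstream models
resilient to typos and character-level attacks.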
Quaternion-Based Self-Attentive Long Short-Term User Preference Encoding for Recommendation
Quaternion space has brought several benefits over the traditional Euclidean
space: Quaternions (i) consist of a real and three imaginary components,
encouraging richer representations; (ii) utilize Hamilton product which better
encodes the inter-latent interactions across multiple Quaternion components;
and (iii) result in models with fewer degrees of freedom that are less prone to
overfitting. Unfortunately, most current recommender systems rely on
real-valued representations in Euclidean space to model either users' long-term
or short-term interests. In this paper, we fully utilize Quaternion space to
model both users' long-term and short-term preferences. We first propose a
QUaternion-based self-Attentive Long term user Encoding (QUALE) to study the
user's long-term intents. Then, we propose a QUaternion-based self-Attentive
Short term user Encoding (QUASE) to learn the user's short-term interests. To
enhance our models' capability, we propose to fuse QUALE and QUASE into one
model, namely QUALSE, by using a Quaternion-based gating mechanism. We further
develop Quaternion-based Adversarial learning along with the Bayesian
Personalized Ranking (QABPR) to improve our model's robustness. Extensive
experiments on six real-world datasets show that our fused QUALSE model
outperformed 11 state-of-the-art baselines, improving HIT@1 by 8.43% and
NDCG@1 by 10.27% on average compared with the best baseline.
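For reference, the Hamilton product on quaternions q = a + b*i + c*j + d*k,
which the abstract credits with richer inter-component interactions than an
element-wise product; a plain NumPy sketch, independent of the QUALSE models
themselves.

    import numpy as np

    def hamilton_product(q, p):
        # q, p: arrays (a, b, c, d) for a + b*i + c*j + d*k.
        a1, b1, c1, d1 = q
        a2, b2, c2, d2 = p
        return np.array([
            a1*a2 - b1*b2 - c1*c2 - d1*d2,   # real part
            a1*b2 + b1*a2 + c1*d2 - d1*c2,   # i
            a1*c2 - b1*d2 + c1*a2 + d1*b2,   # j
            a1*d2 + b1*c2 - c1*b2 + d1*a2,   # k
        ])

Every component of the output mixes all four components of both inputs, which
is the cross-component coupling point (ii) refers to; used as the
multiplication inside a layer, it also ties parameters together, giving the
reduced degrees of freedom noted in (iii).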